Bayesian Feature Enhancement for ASR of Noisy Reverberant Real-World Data

Authors

Alexander Krueger

Oliver Walter

Volker Leutnant

Reinhold Häb-Umbach

Abstract

In this contribution we investigate the effectiveness of Bayesian feature enhancement (BFE) on a medium-sized recognition task containing real-world recordings of noisy reverberant speech. BFE employs a very coarse model of the acoustic impulse response (AIR) from the source to the microphone, which has been shown to be effective when the speech to be recognized was generated by artificially convolving nonreverberant speech with a constant AIR. Here we demonstrate that the model is also appropriate for feature enhancement of true recordings of noisy reverberant speech. On the Multi-Channel Wall Street Journal Audio Visual corpus (MCWSJ-AV), the word error rate is cut in half to 41.9% compared to the ETSI Standard Front-End, using as input the signal of a single distant microphone with a single recognition pass.


Similar resources

Deep neural network based spectral feature mapping for robust speech recognition

Automatic speech recognition (ASR) systems suffer from performance degradation under noisy and reverberant conditions. In this work, we explore a deep neural network (DNN) based approach for spectral feature mapping from corrupted speech to clean speech. The DNN based mapping substantially reduces interference and produces estimated clean spectral features for ASR training and decoding. We expe...


A Multichannel Feature Compensation Approach for Robust ASR in Noisy and Reverberant Environments

In this paper we propose a multichannel feature compensation approach for automatic speech recognition in reverberant and noisy environments. The proposed technique propagates the posterior of the clean signal estimated by a multichannel Wiener filter in short-time Fourier transform (STFT) domain into Mel-frequency cepstrum coefficients (MFCC) domain. The multichannel Wiener filter reduces both...


Three ways to adapt a CTS recognizer to unseen reverberated speech in BUT system for the ASpIRE challenge

This paper describes several strategies tested in BUT’s submission to the IARPA ASpIRE challenge. The ASpIRE task was to develop an automatic speech recognition (ASR) system for wide-band noisy reverberant speech, while only clean CTS (Fisher) data was allowed for ASR training. To solve this task, we have started with augmenting Fisher data with artificially noised and reverberated versions. Th...


Effectiveness of dereverberation, feature transformation, discriminative training methods, and system combination approach for various reverberant environments

The recently released REverberant Voice Enhancement and Recognition Benchmark (REVERB) challenge includes a reverberant automatic speech recognition (ASR) task. This paper describes our proposed system based on multi-channel speech enhancement preprocessing and state-of-the-art ASR techniques. For preprocessing, we propose a single-channel dereverberation method with reverberation time estimati...


On the role of missing data imputation and NMF feature enhancement in building synthetic voices using reverberant speech

In this paper, we study the role of a recently proposed feature enhancement technique in building HMM-based synthetic voices using reverberant speech data. The feature enhancement technique studied combines the advantages of missing data imputation and non-negative matrix factorization (NMF) based methods in cleaning up the reverberant features. Speaker adaptation of a clean average voice using...




Publication date: 2012
